Skip to content

test: add retry logic for eventloopdelay histogram sampling#61796

Open
mcollina wants to merge 2 commits into
nodejs:mainfrom
mcollina:fix-flaky-test-performance-eventloopdelay-v2
Open

test: add retry logic for eventloopdelay histogram sampling#61796
mcollina wants to merge 2 commits into
nodejs:mainfrom
mcollina:fix-flaky-test-performance-eventloopdelay-v2

Conversation

@mcollina
Copy link
Copy Markdown
Member

Summary

Fixes flaky test-performance-eventloopdelay test on sharedlibs builds.

Problem

On some build configurations (e.g., Ubuntu 24.04 sharedlibs), the histogram may not record valid samples (min > 0) after the initial spinning period. The test fails with:

AssertionError [ERR_ASSERTION]: assert(histogram.min > 0)

The previous fix (PR #61629) added setImmediate to allow samples to be recorded, but this wasn't sufficient for all configurations.

Solution

Add a retry mechanism that checks if valid samples have been recorded (count > 0 && min > 0 && max > 0). If not, the test will spin the event loop additional times (up to 3 retries) before asserting.

This ensures the test passes on slower or differently-configured systems where event loop delay sampling may take longer to produce measurable results.

Verification

Tested locally with python3 tools/test.py --repeat 30 - all passed.

Refs: nodejs/reliability#1461

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. test Issues and PRs related to the tests. labels Feb 12, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 12, 2026

Codecov Report

βœ… All modified and coverable lines are covered by tests.
βœ… Project coverage is 89.71%. Comparing base (4a13a62) to head (64c291a).
⚠️ Report is 701 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #61796      +/-   ##
==========================================
- Coverage   89.76%   89.71%   -0.05%     
==========================================
  Files         675      675              
  Lines      204674   204797     +123     
  Branches    39330    39348      +18     
==========================================
+ Hits       183716   183734      +18     
- Misses      13235    13336     +101     
- Partials     7723     7727       +4     

see 52 files with indirect coverage changes

πŸš€ New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • πŸ“¦ JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

On some build configurations (e.g., sharedlibs), the histogram may not
record valid samples (min > 0) after the initial spinning period. This
adds a retry mechanism that will spin the event loop additional times
if valid samples haven't been recorded yet.

This helps ensure the test passes on slower or differently-configured
systems where event loop delay sampling may take longer to produce
measurable results.

Refs: nodejs/reliability#1461
@mcollina mcollina force-pushed the fix-flaky-test-performance-eventloopdelay-v2 branch 2 times, most recently from a0b9aa0 to ad1c0a7 Compare February 13, 2026 10:43
Extract assertions into separate function and simplify control flow.
Use setImmediate for final assertions to ensure histogram has
recorded all samples.
@mcollina mcollina force-pushed the fix-flaky-test-performance-eventloopdelay-v2 branch from ad1c0a7 to 64c291a Compare February 14, 2026 16:48
@richardlau richardlau added the request-ci Add this label to start a Jenkins CI on a PR. label May 11, 2026
@github-actions github-actions Bot removed the request-ci Add this label to start a Jenkins CI on a PR. label May 11, 2026
@nodejs-github-bot
Copy link
Copy Markdown
Collaborator

@mcollina mcollina added commit-queue Add this label to land a pull request using GitHub Actions. and removed needs-ci PRs that need a full CI run. labels May 11, 2026
@nodejs-github-bot nodejs-github-bot added commit-queue-failed An error occurred while landing this pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels May 11, 2026
@nodejs-github-bot
Copy link
Copy Markdown
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/61796
βœ”  Done loading data for nodejs/node/pull/61796
----------------------------------- PR info ------------------------------------
Title      test: add retry logic for eventloopdelay histogram sampling (#61796)
Author     Matteo Collina <matteo.collina@gmail.com> (@mcollina)
Branch     mcollina:fix-flaky-test-performance-eventloopdelay-v2 -> nodejs:main
Labels     test
Commits    2
 - test: add retry logic for eventloopdelay histogram sampling
 - test: refactor eventloopdelay test for clarity
Committers 1
 - Matteo Collina <hello@matteocollina.com>
PR-URL: https://github.com/nodejs/node/pull/61796
Refs: https://github.com/nodejs/reliability/issues/1461
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Richard Lau <richard.lau@ibm.com>
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/61796
Refs: https://github.com/nodejs/reliability/issues/1461
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Richard Lau <richard.lau@ibm.com>
--------------------------------------------------------------------------------
   β„Ή  This PR was created on Thu, 12 Feb 2026 22:07:51 GMT
   βœ”  Approvals: 4
   βœ”  - James M Snell (@jasnell) (TSC): https://github.com/nodejs/node/pull/61796#pullrequestreview-3797902216
   βœ”  - Colin Ihrig (@cjihrig): https://github.com/nodejs/node/pull/61796#pullrequestreview-3797906488
   βœ”  - Paolo Insogna (@ShogunPanda) (TSC): https://github.com/nodejs/node/pull/61796#pullrequestreview-4261607012
   βœ”  - Richard Lau (@richardlau) (TSC): https://github.com/nodejs/node/pull/61796#pullrequestreview-4263460373
   βœ”  Last GitHub CI successful
   β„Ή  Last Full PR CI on 2026-05-11T12:08:59Z: https://ci.nodejs.org/job/node-test-pull-request/73370/
- Querying data for job/node-test-pull-request/73370/
βœ”  Build data downloaded
   βœ”  Last Jenkins CI successful
--------------------------------------------------------------------------------
   βœ”  No git cherry-pick in progress
   βœ”  No git am in progress
   βœ”  No git rebase in progress
--------------------------------------------------------------------------------
- Bringing origin/main up to date...
From https://github.com/nodejs/node
 * branch                  main       -> FETCH_HEAD
βœ”  origin/main is now up-to-date
- Downloading patch for 61796
From https://github.com/nodejs/node
 * branch                  refs/pull/61796/merge -> FETCH_HEAD
βœ”  Fetched commits as 0a60e9013bb9..64c291afb908
--------------------------------------------------------------------------------
[main 9caf8f098e] test: add retry logic for eventloopdelay histogram sampling
 Author: Matteo Collina <hello@matteocollina.com>
 Date: Thu Feb 12 23:07:10 2026 +0100
 1 file changed, 62 insertions(+), 45 deletions(-)
[main 5d9d4fb500] test: refactor eventloopdelay test for clarity
 Author: Matteo Collina <hello@matteocollina.com>
 Date: Sat Feb 14 17:47:16 2026 +0100
 1 file changed, 48 insertions(+), 45 deletions(-)
   βœ”  Patches applied
There are 2 commits in the PR. Attempting autorebase.
(node:396) [DEP0190] DeprecationWarning: Passing args to a child process with shell option true can lead to security vulnerabilities, as the arguments are not escaped, only concatenated.
(Use `node --trace-deprecation ...` to show where the warning was created)
Rebasing (2/4)

οΏ½[KExecuting: git node land --amend --yes
⚠ Found Refs: https://github.com/nodejs/reliability/issues/1461, skipping..
--------------------------------- New Message ----------------------------------
test: add retry logic for eventloopdelay histogram sampling

On some build configurations (e.g., sharedlibs), the histogram may not
record valid samples (min > 0) after the initial spinning period. This
adds a retry mechanism that will spin the event loop additional times
if valid samples haven't been recorded yet.

This helps ensure the test passes on slower or differently-configured
systems where event loop delay sampling may take longer to produce
measurable results.

Refs: nodejs/reliability#1461
PR-URL: #61796
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Richard Lau <richard.lau@ibm.com>

[detached HEAD e60a0ec86a] test: add retry logic for eventloopdelay histogram sampling
Author: Matteo Collina <hello@matteocollina.com>
Date: Thu Feb 12 23:07:10 2026 +0100
1 file changed, 62 insertions(+), 45 deletions(-)
Rebasing (3/4)
Rebasing (4/4)

οΏ½[KExecuting: git node land --amend --yes
--------------------------------- New Message ----------------------------------
test: refactor eventloopdelay test for clarity

Extract assertions into separate function and simplify control flow.
Use setImmediate for final assertions to ensure histogram has
recorded all samples.

PR-URL: #61796
Refs: nodejs/reliability#1461
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Richard Lau <richard.lau@ibm.com>

[detached HEAD f8d7aba6d5] test: refactor eventloopdelay test for clarity
Author: Matteo Collina <hello@matteocollina.com>
Date: Sat Feb 14 17:47:16 2026 +0100
1 file changed, 48 insertions(+), 45 deletions(-)

οΏ½[KSuccessfully rebased and updated refs/heads/main.

β„Ή Add commit-queue-squash label to land the PR as one commit, or commit-queue-rebase to land as separate commits.

https://github.com/nodejs/node/actions/runs/25685898940

@mcollina mcollina added commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. and removed commit-queue-failed An error occurred while landing this pull request using GitHub Actions. labels May 11, 2026
@nodejs-github-bot nodejs-github-bot added commit-queue-failed An error occurred while landing this pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels May 11, 2026
@nodejs-github-bot
Copy link
Copy Markdown
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/61796
βœ”  Done loading data for nodejs/node/pull/61796
----------------------------------- PR info ------------------------------------
Title      test: add retry logic for eventloopdelay histogram sampling (#61796)
Author     Matteo Collina <matteo.collina@gmail.com> (@mcollina)
Branch     mcollina:fix-flaky-test-performance-eventloopdelay-v2 -> nodejs:main
Labels     test, commit-queue-squash
Commits    2
 - test: add retry logic for eventloopdelay histogram sampling
 - test: refactor eventloopdelay test for clarity
Committers 1
 - Matteo Collina <hello@matteocollina.com>
PR-URL: https://github.com/nodejs/node/pull/61796
Refs: https://github.com/nodejs/reliability/issues/1461
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Richard Lau <richard.lau@ibm.com>
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/61796
Refs: https://github.com/nodejs/reliability/issues/1461
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Richard Lau <richard.lau@ibm.com>
--------------------------------------------------------------------------------
   β„Ή  This PR was created on Thu, 12 Feb 2026 22:07:51 GMT
   βœ”  Approvals: 4
   βœ”  - James M Snell (@jasnell) (TSC): https://github.com/nodejs/node/pull/61796#pullrequestreview-3797902216
   βœ”  - Colin Ihrig (@cjihrig): https://github.com/nodejs/node/pull/61796#pullrequestreview-3797906488
   βœ”  - Paolo Insogna (@ShogunPanda) (TSC): https://github.com/nodejs/node/pull/61796#pullrequestreview-4261607012
   βœ”  - Richard Lau (@richardlau) (TSC): https://github.com/nodejs/node/pull/61796#pullrequestreview-4263460373
   βœ”  Last GitHub CI successful
   β„Ή  Last Full PR CI on 2026-05-11T17:29:59Z: https://ci.nodejs.org/job/node-test-pull-request/73370/
- Querying data for job/node-test-pull-request/73370/
βœ”  Build data downloaded
   βœ”  Last Jenkins CI successful
--------------------------------------------------------------------------------
   βœ”  No git cherry-pick in progress
   βœ”  No git am in progress
   βœ”  No git rebase in progress
--------------------------------------------------------------------------------
- Bringing origin/main up to date...
From https://github.com/nodejs/node
 * branch                  main       -> FETCH_HEAD
βœ”  origin/main is now up-to-date
- Downloading patch for 61796
From https://github.com/nodejs/node
 * branch                  refs/pull/61796/merge -> FETCH_HEAD
βœ”  Fetched commits as 0a60e9013bb9..64c291afb908
--------------------------------------------------------------------------------
[main 36fc919cb8] test: add retry logic for eventloopdelay histogram sampling
 Author: Matteo Collina <hello@matteocollina.com>
 Date: Thu Feb 12 23:07:10 2026 +0100
 1 file changed, 62 insertions(+), 45 deletions(-)
[main cca689e6e5] test: refactor eventloopdelay test for clarity
 Author: Matteo Collina <hello@matteocollina.com>
 Date: Sat Feb 14 17:47:16 2026 +0100
 1 file changed, 48 insertions(+), 45 deletions(-)
   βœ”  Patches applied
There are 2 commits in the PR. Attempting to fixup everything into first commit.
[main fb48362ff4] test: add retry logic for eventloopdelay histogram sampling
 Author: Matteo Collina <hello@matteocollina.com>
 Date: Thu Feb 12 23:07:10 2026 +0100
 1 file changed, 66 insertions(+), 46 deletions(-)
   ⚠  Found Refs: https://github.com/nodejs/reliability/issues/1461, skipping..
--------------------------------- New Message ----------------------------------
test: add retry logic for eventloopdelay histogram sampling

On some build configurations (e.g., sharedlibs), the histogram may not
record valid samples (min > 0) after the initial spinning period. This
adds a retry mechanism that will spin the event loop additional times
if valid samples haven't been recorded yet.

This helps ensure the test passes on slower or differently-configured
systems where event loop delay sampling may take longer to produce
measurable results.

Refs: nodejs/reliability#1461
PR-URL: #61796
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Richard Lau <richard.lau@ibm.com>

[main 59528f1c9f] test: add retry logic for eventloopdelay histogram sampling
Author: Matteo Collina <hello@matteocollina.com>
Date: Thu Feb 12 23:07:10 2026 +0100
1 file changed, 66 insertions(+), 46 deletions(-)
βœ– 59528f1c9fde8b832bf42dc30de1b13fdc833dcc
βœ” 0:0 no Assisted-by metadata assisted-by-is-trailer
βœ” 0:0 no Co-authored-by metadata co-authored-by-is-trailer
βœ” 0:0 skipping fixes-url fixes-url
βœ” 0:0 blank line after title line-after-title
βœ” 0:0 line-lengths are valid line-length
βœ” 0:0 metadata is at end of message metadata-end
βœ” 11:8 PR-URL is valid. pr-url
βœ” 0:0 reviewers are valid reviewers
βœ– 0:0 Commit must have a "Signed-off-by" trailer signed-off-by
βœ” 0:0 valid subsystems subsystem
βœ” 0:0 Title is formatted correctly. title-format
⚠ 0:50 Title should be <= 50 columns. title-length

β„Ή Please fix the commit message and try again.
Please manually ammend the commit message, by running
git commit --amend
Once commit message is fixed, finish the landing command running
git node land --continue

https://github.com/nodejs/node/actions/runs/25688115197

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commit-queue-failed An error occurred while landing this pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. test Issues and PRs related to the tests.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants